56 research outputs found

    Timing in trace conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus): scalar, nonscalar, and adaptive features

    Using interstimulus intervals (ISIs) of 125, 250, and 500 msec in trace conditioning of the rabbit nictitating membrane response, the offset times and durations of conditioned responses (CRs) were collected along with onset and peak latencies. All measures were proportional to the ISI, but only onset and peak latencies conformed to the criterion for scalar timing. Regarding the CR’s possible protective overlap of the unconditioned stimulus (US), CR duration increased with ISI, while the peak’s alignment with the US declined. Implications for models of timing and CR adaptiveness are discussed

    Reward context determines risky choice in pigeons and humans

    Whereas humans are risk averse for monetary gains, other animals can be risk seeking for food rewards, especially when faced with variable delays or under significant deprivation. A key difference between these findings is that humans are often explicitly told about the risky options, whereas non-human animals must learn about them from their own experience. We tested pigeons (Columba livia) and humans in formally identical choice tasks where all outcomes were learned from experience. Both species were more risk seeking for larger rewards than for smaller ones. The data suggest that the largest and smallest rewards experienced are overweighted in risky choice. This observed bias towards extreme outcomes represents a key step towards a consilience of these two disparate literatures, identifying common features that drive risky choice across phyla

    Anti-social motives explain increased risk aversion for others in decisions from experience

    When deciding for others based on explicitly described odds and outcomes, people often have different risk preferences for others than for themselves. In two pre-registered experiments, we examine risk preference for others where people learn about the odds and outcomes by experiencing them through sampling. In both experiments, on average, people were more risk averse for others than for themselves, but only when the risky option had a higher expected value. Furthermore, based on a separate set of choices, we classified people as pro- or anti-social. Only those people classified as anti-social were more risk averse for others, whereas those classified as prosocial chose similarly for themselves and others. When the uncertainty was removed, however, all participants exhibited less anti-social behavior. Together, these results suggest that anti-social motives contribute to the observed limited risk taking for others and that outcome uncertainty facilitates the expression of these motives

    Habits without values

    Habits form a crucial component of behavior. In recent years, key computational models have conceptualized habits as arising from model-free reinforcement learning (RL) mechanisms, which typically select between available actions based on the future value expected to result from each. Traditionally, however, habits have been understood as behaviors that can be triggered directly by a stimulus, without requiring the animal to evaluate expected outcomes. Here, we develop a computational model instantiating this traditional view, in which habits develop through the direct strengthening of recently taken actions rather than through the encoding of outcomes. We demonstrate that this model accounts for key behavioral manifestations of habits, including insensitivity to outcome devaluation and contingency degradation, as well as the effects of reinforcement schedule on the rate of habit formation. The model also explains the prevalent observation of perseveration in repeated-choice tasks as an additional behavioral manifestation of the habit system. We suggest that mapping habitual behaviors onto value-free mechanisms provides a parsimonious account of existing behavioral and neural data. This mapping may provide a new foundation for building robust and comprehensive models of the interaction of habits with other, more goal-directed types of behaviors and help to better guide research into the neural mechanisms underlying control of instrumental behavior more generally
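
    The value-free idea described in this abstract can be illustrated with a minimal sketch. The update rule, the function name `update_habits`, and the learning rate `alpha` are assumptions made for illustration and are not taken from the paper: each action's habit strength moves toward 1 when that action was just taken and toward 0 otherwise, and no reward or outcome term appears anywhere.

```python
def update_habits(habits, chosen, alpha=0.1):
    """Return new habit strengths after taking action `chosen`.

    Hypothetical rule: the strength of the chosen action moves toward 1,
    all other strengths move toward 0. No outcome or reward is used.
    """
    return [
        h + alpha * ((1.0 if action == chosen else 0.0) - h)
        for action, h in enumerate(habits)
    ]


def repeat_action(n_trials, chosen=0, n_actions=2, alpha=0.1):
    """Take the same action n_trials times and return the final strengths."""
    habits = [0.0] * n_actions
    for _ in range(n_trials):
        habits = update_habits(habits, chosen, alpha)
    return habits


if __name__ == "__main__":
    # Repetition alone strengthens the habit, whatever the outcomes were.
    print(repeat_action(1))
    print(repeat_action(50))
```

    Because the update depends only on which action was taken, changing the value of the outcome afterwards would leave these strengths untouched, which is one way to read the insensitivity to outcome devaluation mentioned above.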

    Information seeking as chasing anticipated prediction errors

    When faced with delayed, uncertain rewards, humans and other animals usually prefer to know the eventual outcomes in advance. This preference for cues providing advance information can lead to seemingly suboptimal choices, where less reward is preferred over more reward. Here, we introduce a reinforcement-learning model of this behavior, the anticipated prediction error (APE) model, based on the idea that prediction errors themselves can be rewarding. As a result, animals will sometimes pick options that yield large prediction errors, even when the expected rewards are smaller. We compare the APE model against an alternative information-bonus model, where information itself is viewed as rewarding. These models are evaluated against a newly collected dataset with human participants. The APE model fits the data as well as or better than the other models, with fewer free parameters, thus providing a more robust and parsimonious account of the suboptimal choices. These results suggest that anticipated prediction errors can be an important signal underpinning decision making

    Comparative inspiration: from puzzles with pigeons to novel discoveries with humans in risky choice

    Both humans and non-human animals regularly encounter decisions involving risk and uncertainty. This paper provides an overview of our research program examining risky decisions in which the odds and outcomes are learned through experience in people and pigeons. We summarize the results of 15 experiments across 8 publications, with a total of over 1300 participants. We highlight 4 key findings from this research: (1) people choose differently when the odds and outcomes are learned through experience compared to when they are described; (2) when making decisions from experience, people overweight values at or near the ends of the distribution of experienced values (i.e., the best and the worst, termed the “extreme-outcome rule”), which leads to more risk seeking for relative gains than for relative losses; (3) people show biases in self-reported memory whereby they are more likely to report an extreme outcome than an equally-often experienced non-extreme outcome, and they judge these extreme outcomes as having occurred more often; and (4) under certain circumstances pigeons show similar patterns of risky choice as humans, but the underlying processes may not be identical. This line of research has stimulated other research in the field of judgement and decision making, illustrating how investigations from a comparative perspective can lead in surprising directions

    When good news leads to bad choices

    Pigeons and other animals sometimes deviate from optimal choice behavior when given informative signals for delayed outcomes. For example, when pigeons are given a choice between an alternative that always leads to food after a delay and an alternative that leads to food only half of the time after a delay, preference changes dramatically depending on whether the stimuli during the delays are correlated with (signal) the outcomes or not. With signaled outcomes, pigeons show a much greater preference for the suboptimal alternative than with unsignaled outcomes. Key variables and research findings related to this phenomenon are reviewed, including the effects of durations of the choice and delay periods, probability of reinforcement, and gaps in the signal. We interpret the available evidence as reflecting a preference induced by signals for good news in a context of uncertainty. Other explanations are briefly summarized and compared

    The power of nothing: risk preference in pigeons, but not people, is driven primarily by avoidance of zero outcomes

    When making risky decisions, people and pigeons often show similar choice patterns. When people learn the reward probabilities through repeated exposure to the outcomes, their preference is disproportionately influenced by the extreme (highest and lowest) outcomes occurring in the decision context. Overweighting of these extremes increases preference for risky alternatives that lead to the highest outcome and decreases preference for risky alternatives that lead to the lowest outcome, termed the extreme-outcome rule. This rule predicts greater risk seeking for choices between safe and risky high-value outcomes than for choices between safe and risky low-value outcomes, when both choices occur in the same context. In a series of studies, we examine how this extreme-outcome rule generalizes within and across two evolutionarily distant species: pigeons (Columba livia) and humans (Homo sapiens). Both species showed risky choices consistent with the extreme-outcome rule when a low-value risky option could yield an outcome of zero. When all outcome values were increased such that none of the options could lead to zero, people but not pigeons were still consistent with the extreme-outcome rule. Unlike people, pigeons no longer avoided a low-value risky option when it yielded a non-zero food outcome. These results suggest that, despite some similarities, different mechanisms underlie risky choice in pigeons and people
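
    The prediction of the extreme-outcome rule described in this abstract can be made concrete with a small sketch. Everything specific here is an assumption for illustration, not the authors' model or data: the function `weighted_value`, the weighting factor `extreme_weight`, and the outcome values (safe 60 vs. risky 40 or 80; safe 20 vs. risky 0 or 40) are all hypothetical.

```python
def weighted_value(outcomes, context_min, context_max, extreme_weight=2.0):
    """Weighted mean of experienced outcomes.

    Hypothetical scheme: outcomes equal to the lowest or highest value in the
    whole decision context count `extreme_weight` times; all others count once.
    """
    total = 0.0
    weight_sum = 0.0
    for x in outcomes:
        w = extreme_weight if x in (context_min, context_max) else 1.0
        total += w * x
        weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical context: every outcome lies between 0 and 80.
    lo, hi = 0, 80
    risky_high = weighted_value([40, 80], lo, hi)  # contains the context maximum
    safe_high = weighted_value([60, 60], lo, hi)
    risky_low = weighted_value([0, 40], lo, hi)    # contains the context minimum
    safe_low = weighted_value([20, 20], lo, hi)
    print("high-value choice prefers risky:", risky_high > safe_high)
    print("low-value choice prefers risky:", risky_low > safe_low)
```

    Under these assumed weights the risky option looks better than its safe counterpart only in the high-value choice, even though each risky option has the same plain average as its safe alternative.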

    Intertrial unconditioned stimuli differentially impact trace conditioning

    Three experiments assessed how appetitive conditioning in rats changes over the duration of a trace conditioned stimulus (CS) when unsignaled unconditioned stimuli (USs) are introduced into the intertrial interval. In Experiment 1, a target US occurred at a fixed time either shortly before (embedded), shortly after (trace), or at the same time (delay) as the offset of a 120-s CS. During the CS, responding was most suppressed by intertrial USs in the trace group, less so in the delay group, and least in the embedded group. Unreinforced probe trials revealed a bell-shaped curve centered on the normal US arrival time during the trace interval, suggesting that temporally-specific learning occurred both with and without intertrial USs. Experiments 2a and 2b confirmed that the bulk of the trace CS became inhibitory when intertrial USs were scheduled, as measured by summation and retardation tests, even though CS offset evoked a temporally precise conditioned response. Thus, an inhibitory CS may give rise to new stimuli specifically linked to its termination, which were excitatory. A modification to the microstimulus temporal difference model is offered to account for the data